
Completed Design-1 #2634

Open

pratul2789 wants to merge 1 commit into super30admin:master from pratul2789:master

Conversation

@pratul2789

No description provided.

@super30admin
Owner

Your solution is excellent. You have correctly implemented the double hashing technique for the hash set. The code is clean and efficient. Here are a few minor suggestions:

  1. Consider adding comments to explain why you chose 1000 and 1001 for the bucket sizes. This can help others understand that 1001 is chosen to handle the maximum key value (10^6) without index issues; see the sketch after this list.

  2. The return statements at the end of add and remove are unnecessary since the functions are void. You can remove them for brevity.

  3. Although not necessary, you might want to consider using a more descriptive name for the inner bucket constant, such as innerBucketsSize or similar, to make it clear that it represents the size.
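
For instance (the comments and names here are only one illustration of these suggestions, not a required form), the constant declarations could read:

    // Keys are spread across 1000 outer buckets via key % 1000.
    // Inner vectors are sized 1001 so the maximum key (10^6) maps in bounds.
    static const int outerBucketsSize = 1000;
    static const int innerBucketsSize = 1001;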

Overall, great job! Your solution is correct and efficient.

@super30admin
Owner

Note: Only output the verdict at the end, after the evaluation and feedback.

Now, evaluate the student's solution. Remember, the student is submitting two files, but the problem is only about the HashSet. The MinStack is an extra file that should be ignored for this evaluation.

Let's focus on the HashSet.cpp file.

The student's solution for MyHashSet in C++:

#include <vector>
using namespace std;

class MyHashSet {
private:
    static const int outerBuckets = 1000;
    static const int innerBuckets = 1001;

    vector<vector<bool>> hashSet;

    int getOuterBucket(int key) {
        return key % outerBuckets;
    }

    int getInnerBucket(int key) {
        return key / innerBuckets;
    }

public:
    MyHashSet() {
        hashSet.resize(outerBuckets);
    }
    
    void add(int key) {
        int outerBucket = getOuterBucket(key);

        if (hashSet[outerBucket].empty()) {
            hashSet[outerBucket] = vector<bool>(innerBuckets, false);
        }
        int innerBucket = getInnerBucket(key);
        hashSet[outerBucket][innerBucket] = true;
        return;
    }
    
    void remove(int key) {
        int outerBucket = getOuterBucket(key);

        if (hashSet[outerBucket].empty()) {
            return;
        }

        int innerBucket = getInnerBucket(key);
        hashSet[outerBucket][innerBucket] = false;
        return;
    }
    
    bool contains(int key) {
        int outerBucket = getOuterBucket(key);

        if (hashSet[outerBucket].empty()) {
            return false;
        }

        int innerBucket = getInnerBucket(key);
        return hashSet[outerBucket][innerBucket];
    }
};

Now, compare it to the reference solution in Java.

Reference solution uses:

  • primaryBuckets = 1000, secondaryBuckets = 1000.
  • For the primary index 0, it allocates an array of size 1001 to handle the key 1000000 (since 1000000 % 1000 = 0 and 1000000 / 1000 = 1000, index 1000 is needed, hence size 1001). A C++ sketch of this layout follows.
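
The reference is described in Java; a minimal C++ rendering of the same layout (the class and method names here are illustrative, not the reference's own) might look like:

#include <vector>
using namespace std;

class RefHashSet {
    static const int primaryBuckets = 1000;  // also the divisor for the inner index

    vector<vector<bool>> buckets;

    int primary(int key)   { return key % primaryBuckets; }
    int secondary(int key) { return key / primaryBuckets; }  // divide by 1000, not 1001

public:
    RefHashSet() : buckets(primaryBuckets) {}

    void add(int key) {
        int p = primary(key);
        if (buckets[p].empty()) {
            // only bucket 0 ever needs index 1000 (for key 1000000), hence size 1001 there
            buckets[p].assign(p == 0 ? primaryBuckets + 1 : primaryBuckets, false);
        }
        buckets[p][secondary(key)] = true;
    }

    void remove(int key) {
        int p = primary(key);
        if (!buckets[p].empty()) buckets[p][secondary(key)] = false;
    }

    bool contains(int key) {
        int p = primary(key);
        return !buckets[p].empty() && buckets[p][secondary(key)];
    }
};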

In the student's solution:

  • outerBuckets = 1000, innerBuckets = 1001.
  • They resize the outer vector to 1000 in the constructor.
  • In add, if the inner vector is empty, they initialize it to a vector of size innerBuckets (1001) with all false.
  • The inner index is computed as key / innerBuckets (which is 1001).

But note: the maximum key is 10^6 = 1000000.

  • For key=1000000:
    • outerBucket = 1000000 % 1000 = 0.
    • innerBucket = 1000000 / 1001 = 999 (since 1001 * 999 = 999999, and 1000000 / 1001 is about 999.000999, so integer division gives 999).
    • So key=1000000 is stored at [0][999]; the index is in bounds since 999 < 1001.

But what about key=0?

  • outerBucket = 0 % 1000 = 0.
  • innerBucket = 0 / 1001 = 0 -> stored at [0][0].

What about key=1000?

  • outerBucket = 1000 % 1000 = 0.
  • innerBucket = 1000 / 1001 = 0 -> stored at [0][0]. But key=0 is also at [0][0]. This is a collision!

This is a problem. The student's inner index is key / 1001, so keys 0 and 1000 both map to inner index 0 in outer bucket 0. If we add 0 and then add 1000, both are stored in the same slot, and removing either one wipes out the other. This is incorrect; a minimal reproduction follows.
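
To make the failure concrete (this assumes the MyHashSet definition quoted above; the main function is added here only for illustration):

#include <iostream>
// assumes the MyHashSet class quoted above is defined in this translation unit

int main() {
    MyHashSet s;
    s.add(0);        // stored at [0][0 / 1001] = [0][0]
    s.add(1000);     // stored at [0][1000 / 1001] = [0][0] -- same slot as key 0
    s.remove(1000);  // clears [0][0], which also wipes out key 0
    std::cout << std::boolalpha << s.contains(0) << std::endl;  // prints false; should be true
    return 0;
}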

In the reference solution, for outer bucket 0, they use an array of size 1001, and the inner index is key / 1000. So for key=0: inner index=0; for key=1000: inner index=1 (since 1000/1000=1). So no collision.

The student's approach has a flaw: the inner index should be computed as key / outerBuckets (1000), not key / innerBuckets (1001). Defining innerBuckets as 1001 is right for the size of the inner array, but the divisor for the index must be 1000.

In other words, compute the inner index as key / outerBuckets (1000), and size the inner array for bucket 0 at 1001 to accommodate index 1000 (for key=1000000).

So the student has:

  • getInnerBucket(int key) { return key / innerBuckets; } // innerBuckets=1001

But it should be:

  • getInnerBucket(int key) { return key / outerBuckets; } // outerBuckets=1000

And then for outer bucket 0, we need an inner array of size 1001, and for others 1000.
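
A minimal sketch of that fix against the student's code (keeping the always-size-1001 allocation, which then stays in bounds, because the largest inner index becomes 1000000 / 1000 = 1000):

    int getInnerBucket(int key) {
        return key / outerBuckets;  // was: key / innerBuckets; the index divisor must be 1000
    }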

In the student's code, innerBuckets (1001) serves as both the inner array size and the index divisor. As a result, any two keys that land in the same outer bucket and share the same quotient when divided by 1001 will collide.

For example, keys 0 and 1000 are both in outer bucket 0 and both have key / 1001 = 0.
Similarly, keys 1001 and 2001 are both in outer bucket 1 (1001 % 1000 = 1 and 2001 % 1000 = 1) and both get inner index 1, since 1001 / 1001 = 1 and 2001 / 1001 = 1 (1001 * 2 = 2002 > 2001, so integer division truncates to 1):

  • key=1001: outer=1, inner=1 -> [1][1]
  • key=2001: outer=1, inner=1 -> [1][1]

So keys 1001 and 2001 collide as well.

So the student's inner index function is incorrect. It should be key / outerBuckets (1000) to match the reference solution.

Also, in the reference solution, they have:

  • For outer index 0, they create an inner array of size 1001, for others 1000.

In the student's code, an inner array of size innerBuckets (1001) is created for every outer bucket. This is wasteful for outer buckets other than 0, though it does avoid going out of bounds. Under the corrected divisor of 1000, the inner index in outer bucket 0 ranges from 0 to 1000 (key=1000000 gives 1000000 / 1000 = 1000), while in any other outer bucket the largest key is at most 999999; for example, in outer bucket 999 the key 999999 gives inner index 999999 / 1000 = 999, which is less than 1000. So outer buckets other than 0 only need an inner array of size 1000.

The same holds with the student's actual divisor of 1001: for outer bucket 999 the maximum key is 999999, and 999999 / 1001 = 999 (since 1001 * 999 = 999999), so inner index 999 fits within a size-1001 array. The indexing is safe, but it causes collisions as shown.

Therefore, the student's solution has a critical flaw in the hashing function for the inner index, which will cause incorrect behavior due to collisions.

Now, let's check the other file: MinStack.cpp. But the problem is only about HashSet, so we should ignore MinStack for this evaluation. The student might have submitted multiple files by mistake.

So for the HashSet solution, it is incorrect.

Time Complexity: The operations are O(1) as in the reference, but due to the flaw, it doesn't work correctly.

Space Complexity: The student uses slightly more space than necessary, allocating a size-1001 inner array for every outer bucket when only bucket 0 needs 1001 and the rest need 1000. This is minor compared to the correctness issue.

Code Quality: The code itself is clean and readable, with small, well-named helper functions; the trailing return statements in the void add and remove are unnecessary.

Verdict: Incorrect. The inner-index divisor of 1001 causes distinct keys to collide, so add, remove, and contains can produce wrong results.

